Abstract 

In competitions involving many participants running many races the final rank 
is determined by the score of each participant, obtained by adding its ranks in each 
individual race. The "Statistical Curse of the Second Half Rank" is the observation 
that if the score of a participant is even modestly worse than the middle score, 
then its final rank will be much worse (that is, much further away from the middle 
rank) than might have been expected. We give an explanation of this effect for 
the case of a large number of races using the Central Limit Theorem. We present 
exact quantitative results in this limit and demonstrate that the score probability 
distribution will be gaussian with scores packing near the center. We also derive 
the final rank probability distribution for the case of two races and we present some 
exact formulae verified by numerical simulations for the case of three races. The 
variant in which the worst result of each boat is dropped from its final score is also 
analyzed and solved for the case of two races. 



1 Introduction 

In competitive individual sports involving many participants it is in some cases standard 
practice to have several races and determine the final rank for each participant by taking 
the sum of its ranks in each individual race, thereby defining its score. By comparing the 
scores of the participants a final rank can be decided among them. Typical examples are 
regattas, which can involve a large number of sailing boats (~100), running a fairly 
large number of consecutive races (> 10). 

An empirical observation of long-time participants is that, if their scores are even 
slightly worse than the average, their final rank will be much worse than expected. This 
frustrating fact, which we may call the "Statistical Curse of the Second Half Rank", 
is analyzed in this work and argued to be due to statistical fluctuations in the results 
of the races, on top of the inherent worth of the participants. Using some simplifying 
assumptions we demonstrate that it can be explained by a version of the Central Limit 
Theorem (CLT) [2] for correlated random variables. A general result for a large number of 
participants and races is derived. Some exact results for a small number of races are 
presented. A variant of the problem, in which the worst score for each participant is 
dropped, is also considered and solved for the case of two races. 

2 Basic setup 

Consider $n_b$ boats racing $n_r$ races. A boat $i$ in the race $k$ has an individual rank 
$n_{i,k} \in [1, n_b]$ (lower ranks represent better performance). The score of the boat $i$ 
is the sum $n_i = \sum_{k=1}^{n_r} n_{i,k} \in [n_r, n_r n_b]$ of its individual ranks in each 
race. The final rank of boat $i$ is determined by the place occupied by its score $n_i$ among 
the scores of the other boats $n_j$, with $j \neq i$. 

For reasons of simplicity we assume that in a given race the ranks are uniformly 
distributed random variables with no ex aequo (that is, all boats are inherently equally 
worthy and there are no ties). We shall also take the ranks in different races to be 
independent random variables. It follows that for the race $k$ the set 
$\{n_{i,k};\ i = 1, 2, \dots, n_b\}$ is a random permutation of $\{1, 2, \dots, n_b\}$ so that 
the $n_{i,k}$'s are correlated random variables (in particular 
$\sum_{i=1}^{n_b} n_{i,k} = n_b(n_b+1)/2$), while $n_{i,k}$ and $n_{j,k'}$ are uncorrelated 
for $k \neq k'$. We are interested in the probability distribution for boat $i$ to have a 
final rank $m \in [1, n_b]$ given its score. 

Let us illustrate this situation in the simple case of three boats racing two races. We 
have to take all random permutations of {1, 2, 3} both for the first and the second race, 
and to add them to determine the possible scores of the three boats. It is easy to see that 
for, say, boat 1 to have a score $n_{1,1} + n_{1,2} = 4$ there are twelve possibilities: 

i) four instances where $n_{1,1} = 1$ and $n_{1,2} = 3$, 

ii) four instances where $n_{1,1} = 2$ and $n_{1,2} = 2$, and 

iii) four instances where $n_{1,1} = 3$ and $n_{1,2} = 1$. 

In each of these three cases (i), (ii) and (iii), one finds that boat 1 has an equal probability 
1/2 for its final rank to be either $m = 1$ or $m = 2$. Its mean rank follows as 
$\langle m \rangle = \frac{1}{2}(1 + 2) = 3/2$. Clearly the score 4 is precisely the middle of 
the set {2,3,4,5,6} and $\langle m \rangle = 3/2$ is indeed close$^1$ to the middle rank 2. 
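The counting above can be reproduced by brute force. The following is a minimal sketch, assuming only the setup of this section (uniformly random permutations in each race, tied boats sharing the best rank); the function name `final_rank_distribution` is ours and is introduced for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def final_rank_distribution(n_b, score, boat=0):
    """Enumerate all pairs of race results and return the number of events in
    which `boat` has the given two-race score, together with the distribution
    of its final rank over those events."""
    results = list(permutations(range(1, n_b + 1)))
    counts = Counter()
    for race1 in results:
        for race2 in results:
            scores = [a + b for a, b in zip(race1, race2)]
            if scores[boat] != score:
                continue
            # Boats with equal scores share the best rank, so the final rank is
            # one plus the number of boats with a strictly lower score.
            counts[1 + sum(s < score for s in scores)] += 1
    events = sum(counts.values())
    return events, {m: Fraction(c, events) for m, c in sorted(counts.items())}


events, distribution = final_rank_distribution(n_b=3, score=4)
mean_rank = sum(m * p for m, p in distribution.items())
print(events, distribution, mean_rank)
```

For three boats and a score of 4 this enumeration should give the twelve events, the two equally likely ranks and the mean rank 3/2 quoted above.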

More interestingly, cases (i), (ii) and (iii) give the same final rank probability 
distribution. It means that the final rank probability distribution depends only on the score of 
boat 1, and not on its individual ranks in each of the two races consistent with its score. 
This fact is particular to two races and would not be true any more for three or more 
races. The final rank probability distribution for boat 1 given its score would depend in 
this case on the full set of its ranks in each race, and not just on its score. The final rank 
probability distribution should then be defined as the average of the above distributions 
for all sets of ranks consistent with its score. 

$^1$ The 1/2 discrepancy is due to the fact that boats with equal scores are all assigned the same final 
rank. E.g., two boats tying in the first place are assigned a rank of 1, while the next boat would have a 
rank of 3. If, instead, the two top boats were assigned a rank of 1.5 (the average of 1 and 2) we would 
have obtained $\langle m \rangle = 2$. This effect, at any rate, will be important only for a small number of boats. 

To avoid this additional averaging and simplify slightly the analysis, we consider from 
now on $n_b$ boats racing $n_r$ races, plus an additional virtual boat which is only specified by 
its score $n_t \in [n_r, n_r(n_b + 1)]$. We are interested in finding the probability distribution for 
this virtual boat to have a final rank $m \in [1, n_b + 1]$ given its score $n_t$ when it is compared 
to the set of scores $\{n_i;\ i = 1, 2, \dots, n_b\}$ of the $n_b$ boats. By definition this probability 
distribution will then depend only on three variables: $n_b$, the number of boats; $n_r$, the 
number of races; and $n_t$, the score of the virtual boat we are interested in. 
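The virtual-boat setup lends itself to a direct numerical experiment. The sketch below is an illustration under the assumptions of this section (independent, uniformly random permutations in each race); the function name, the number of trials and the seed are arbitrary choices of ours.

```python
import random


def simulate_virtual_rank(n_b, n_r, n_t, trials=2000, seed=0):
    """Sample the final rank of a virtual boat of score n_t against n_b boats
    whose ranks in each of the n_r races form a uniformly random permutation."""
    rng = random.Random(seed)
    ranks = list(range(1, n_b + 1))
    samples = []
    for _ in range(trials):
        scores = [0] * n_b
        for _ in range(n_r):
            rng.shuffle(ranks)  # one race: a random permutation of 1..n_b
            for boat, rank in enumerate(ranks):
                scores[boat] += rank
        # final rank = 1 + number of boats with a strictly lower (better) score
        samples.append(1 + sum(s < n_t for s in scores))
    return samples


n_b, n_r = 50, 10
middle = n_r * (n_b + 1) // 2           # middle score, 255 for these values
for n_t in (middle, middle + 25):       # the middle score, then one about 10% worse
    ranks = simulate_virtual_rank(n_b, n_r, n_t)
    print(n_t, sum(ranks) / len(ranks))
```

With these illustrative parameters, a score only about 10% worse than the middle one should already push the mean final rank well into the second half of the fleet.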

3 The limit of many races 

The problem simplifies when some of the parameters determining the size of the system 
become large so that we can use central limit-type results. In this section we consider the 
limit in which the number of races becomes large. 

We start with a reminder of the Central Limit Theorem in the case of correlated 
random variables. Assume $\{x_{i,k};\ i = 1, \dots, n_b;\ k = 1, 2, \dots, n_r\}$ to be correlated random 
variables such that 

• they are independent for different $k$, 

• the set $\{x_{1,k}, x_{2,k}, \dots, x_{n_b,k}\}$ is distributed according to a joint probability 
density which is $k$-independent and whose first two moments (mean and covariance) are 
$\langle x_{i,k} \rangle = \rho_i$ and $\langle x_{i,k} x_{j,k} \rangle - \langle x_{i,k} \rangle \langle x_{j,k} \rangle = \rho_{ij}$. 

The CLT states that in the limit $n_r \gg 1$ the summed variables $x_i = \sum_{k=1}^{n_r} x_{i,k}$ are 
correlated gaussian random variables with $\langle x_i \rangle = n_r \rho_i$ and 
$\langle x_i x_j \rangle - \langle x_i \rangle \langle x_j \rangle = n_r \rho_{ij}$, that 
is, they are distributed in this limit according to the probability density 

$$f(x_1, x_2, \dots, x_{n_b}) = N \exp\Big[-\frac{1}{2 n_r} \sum_{i,j=1}^{n_b} \Lambda_{ij} (x_i - n_r \rho_i)(x_j - n_r \rho_j)\Big]$$

where $N$ is a normalization constant. The matrix $[\Lambda]$ is the inverse of the covariance 
matrix $[\rho]$, assuming that $[\rho]$ is non-singular. 

In the race problem, $x_{i,k} = n_{i,k}$ and $x_i = n_i$. One has $\rho_i = (n_b + 1)/2$, 
$\rho_{ii} = (n_b^2 - 1)/12$ and, for $i \neq j$, $\rho_{ij} = -(n_b + 1)/12$ 
(off diagonal correlations are negative) so that 

$$\rho_{ij} = \frac{n_b + 1}{12}\,(n_b \delta_{ij} - 1)$$

It follows that in the large number of races limit $\langle n_i \rangle = n_r (n_b + 1)/2$ and 
$\langle n_i n_j \rangle - \langle n_i \rangle \langle n_j \rangle = n_r \rho_{ij}$. 

The covariance matrix $[\rho]$ is singular with a single zero-eigenvalue eigenvector $(1, 1, \dots, 1)$. 
Any vector perpendicular to $(1, 1, \dots, 1)$, that is, such that the sum of its entries is 0, is 
an eigenvector with eigenvalue $n_b(n_b + 1)/12$. The fact that $(1, 1, \dots, 1)$ is a zero-eigenvalue 
eigenvector signals that the variable $\sum_{i=1}^{n_b} n_i = n_r n_b(n_b + 1)/2$ is deterministic. It must 
be "taken out" of the set of the scores before finding the large $n_r$ limit. We arrive at the 
probability density 

$$f(n_1, \dots, n_{n_b}) \propto \delta\Big(\sum_{i=1}^{n_b} \big(n_i - n_r \tfrac{n_b+1}{2}\big)\Big) \exp\Big[-\frac{\lambda}{2} \sum_{i=1}^{n_b} \big(n_i - n_r \tfrac{n_b+1}{2}\big)^2\Big], \qquad \lambda = \frac{12}{n_r n_b (n_b+1)}$$

such that indeed $\langle n_i \rangle = n_r \rho_i$ and $\langle n_i n_j \rangle - \langle n_i \rangle \langle n_j \rangle = n_r \rho_{ij}$. 
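The single-race moments entering the CLT can be checked exactly for a small number of boats. The sketch below is an illustration only: it averages over all permutations, and the values it is compared against (mean $(n_b+1)/2$, variance $(n_b^2-1)/12$, off-diagonal covariance $-(n_b+1)/12$) are the standard moments of a uniformly random permutation, stated here as our assumption for what the text denotes $\rho_i$ and $\rho_{ij}$. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def single_race_moments(n_b):
    """Exact mean vector and covariance matrix of the ranks in one race,
    obtained by averaging over all n_b! equally likely permutations."""
    results = list(permutations(range(1, n_b + 1)))
    count = len(results)
    mean = [Fraction(sum(p[i] for p in results), count) for i in range(n_b)]
    cov = [[Fraction(sum(p[i] * p[j] for p in results), count) - mean[i] * mean[j]
            for j in range(n_b)] for i in range(n_b)]
    return mean, cov


n_b = 4
mean, cov = single_race_moments(n_b)
print(mean[0], cov[0][0], cov[0][1])   # mean, variance, off-diagonal covariance
print([sum(row) for row in cov])       # action of the covariance matrix on (1, ..., 1)
```

The vanishing row sums are the singularity discussed above: the sum of the ranks in a race does not fluctuate.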

One can exponentiate the constraint $\delta\big(\sum_{i=1}^{n_b} (n_i - n_r(n_b + 1)/2)\big)$ so that 

For a virtual boat with score $n_t$ the probability to have a final rank $m$ is the probability 
for $m - 1$ boats among the $n_b$'s to have a score $n_i < n_t$ and for the other $n_b - m + 1$'s to 
have a score $n_j \ge n_t$ 

which obviously satisfies $\sum_{m=1}^{n_b+1} P_{n_t}(m) = 1$. It can be rewritten as 

The probability distribution (12) is of binomial form but with a $k$-dependent 'pseudo-probability' 
$w_{n_t}(k)$, and $k$ normally distributed according to $\sqrt{n_b/(2\pi)}\, \exp[-n_b k^2/2]$. We 
find in particular 

where $\mathcal{N}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp[-y^2/2]\, dy$ is the cumulative probability distribution of a normal variable. 
In the large $n_b$ limit, $n_t$ scales like $n_b$ and thus the rescaled score 
$\bar{n}_t = \sqrt{\lambda}\,(n_t - n_r(n_b + 1)/2)$ is $n_b$-independent: the $n_b$ dependence of $P_{n_t}(m)$ 
is solely contained in the binomial coefficient and the exponents, not in $w_{n_t}(k)$. Setting 
$r = m/n_b$ (the percentage rank) and using $n! \approx \sqrt{2\pi n}\,(n/e)^n$ we obtain 

In (15) the exponent of the integrand is negative except when $k = 0$ and $r = w_{n_t}(k)$: for 
large $n_b$, a saddle point approximation yields that $P_{n_t}(r)$ vanishes except when $r$ is taken 
to be $w_{n_t}(0)$. It follows that the final rank of the virtual boat is essentially fixed by its 
score $n_t$ 

$$\bar{r} = \mathcal{N}(\bar{n}_t) \qquad (17)$$

as expected from (13), (14) in the large $n_b$ limit and shown in Fig. 1 for 200 boats racing 
30 races. 
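Relation (17) is easy to evaluate numerically. The sketch below is an illustration under an explicit assumption of ours: the rescaled score is taken to be $\bar{n}_t = \sqrt{\lambda}\,(n_t - n_r(n_b+1)/2)$ with $\lambda = 12/(n_r n_b(n_b+1))$, which is consistent with the width $1/\sqrt{\lambda} \approx n_b\sqrt{n_r/12}$ quoted later in this section. Function names are hypothetical.

```python
from math import erfc, sqrt


def normal_cdf(x):
    """Cumulative distribution of a standard normal variable."""
    return 0.5 * erfc(-x / sqrt(2.0))


def rescaled_score(n_t, n_b, n_r):
    """Assumed rescaling: distance to the middle score in units of 1/sqrt(lambda)."""
    lam = 12.0 / (n_r * n_b * (n_b + 1))
    return sqrt(lam) * (n_t - n_r * (n_b + 1) / 2.0)


def mean_rank_fraction(n_t, n_b, n_r):
    """Large-n_b estimate of r = m / n_b according to (17)."""
    return normal_cdf(rescaled_score(n_t, n_b, n_r))


n_b, n_r = 200, 30
middle = n_r * (n_b + 1) / 2.0          # 3015 for these values
for n_t in (middle, middle + 300):      # the middle score, then one about 10% worse
    print(n_t, mean_rank_fraction(n_t, n_b, n_r))
```

Under this assumption, a score about 10% worse than the middle one corresponds to a percentage rank of roughly 0.83 instead of 0.5, which is the "curse" in quantitative form.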

Figure 1: The final rank of the virtual boat for 200 boats racing 30 races: the continuous 
line is (17) and the points are numerical simulations for a score $n_t$ ranging from 30 to 
6000 by steps of 500. Both data in the curve and the simulation points have been divided 
by 200. The "statistical curse (blessing) of the second (first) half rank" effect is clearly 
visible on the figure. 

The fluctuations of $r$ around $\bar{r}$ are obtained by expanding the exponent in (16) around 
$r = \bar{r}$ (one sets $r = \bar{r} + \epsilon$) and around $k = 0$ so that 

where $w'_{n_t}(0)$ is the derivative of $w_{n_t}(k)$ at $k = 0$. The integration over $k$ in (16) finally 
yields 

which is gaussian distributed around $\epsilon = 0$, i.e. $r = \bar{r}$, with variance 
$[\bar{r}(1 - \bar{r}) + w'_{n_t}(0)^2]/n_b$. Since $w'_{n_t}(0)^2 = -\frac{1}{2\pi} \exp[-\bar{n}_t^2]$ 
and $\bar{r} = w_{n_t}(0)$, $1 - \bar{r} = 1 - w_{n_t}(0) = w_{-n_t}(0)$ we eventually get for the variance 

$$(\Delta m)^2 = n_b\, \varphi(\bar{n}_t) \qquad (21)$$

In the above we introduced the Kollines function 

$$\varphi(x) = \mathcal{N}(x)\,\mathcal{N}(-x) - \frac{1}{2\pi} \exp[-x^2] \qquad (22)$$

It is positive, very flat around $x = 0$ (the first three derivatives vanish at $x = 0$) and is 
essentially zero when $|x| > 3.5$ (see Fig. 2). 
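The qualitative properties of the Kollines function can be checked numerically. The sketch below assumes that $\mathcal{N}$ is the standard normal cumulative distribution and, as an assumption of ours, that the rescaled score is $\bar{n}_t = \sqrt{\lambda}\,(n_t - n_r(n_b+1)/2)$ with $\lambda = 12/(n_r n_b(n_b+1))$; function names are hypothetical.

```python
from math import erfc, exp, pi, sqrt


def normal_cdf(x):
    """Cumulative distribution of a standard normal variable (erfc keeps the tails accurate)."""
    return 0.5 * erfc(-x / sqrt(2.0))


def kollines(x):
    """phi(x) = N(x) N(-x) - exp(-x^2) / (2 pi), as in (22)."""
    return normal_cdf(x) * normal_cdf(-x) - exp(-x * x) / (2.0 * pi)


def rank_std(n_t, n_b, n_r):
    """Standard deviation of the final rank from (21), under the assumed rescaling."""
    lam = 12.0 / (n_r * n_b * (n_b + 1))
    x = sqrt(lam) * (n_t - n_r * (n_b + 1) / 2.0)
    return sqrt(n_b * max(0.0, kollines(x)))


for x in (0.0, 0.1, 1.0, 2.0, 3.5):
    print(x, kollines(x))
print(rank_std(3015, 200, 30))   # fluctuation of the rank at the middle score
```

The values near $x = 0$ barely move, in line with the vanishing of the first three derivatives, while the value at $x = 3.5$ is already several hundred times smaller than at the origin.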
Figure 2: The Kollines function 

It follows that when $|n_t - n_r(n_b + 1)/2| \gg 3.5/\sqrt{\lambda}$ ($\approx 3.5\, n_b \sqrt{n_r/12}$) the final rank has 
no fluctuation. It is only when $|n_t - n_r(n_b + 1)/2| < 3.5/\sqrt{\lambda}$ that $\Delta m \sim \sqrt{n_b}$ as illustrated 
in Fig. 3 for 200 boats racing 30 races. 

Figure 3: The standard deviation of the final rank of the virtual boat for 200 boats racing 
30 races: the continuous line is the square root of the Kollines function and the points 
are numerical simulations for a score $n_t$ ranging from 30 to 6000 by steps of 500. 



4 Small race number: the case $n_r = 2$ 

The problem without the benefit of the large-$n_r$ limit becomes harder and, for generic $n_r$, 
is not amenable to an explicit solution. For the case of few races, however, we can obtain 
exact results. 

In the present section we deal with the case $n_r = 2$, for which we can find the exact 
solution. Fig. 4 displays the mean final ranks and variances of the virtual boat for 
$n_b = 3, 4, \dots, 9$ boats racing 2 races. For a given $n_b$, the score of the virtual boat spans the 
interval $[2, 2n_b + 1]$. 

Figure 4: By complete enumeration of all permutations: the mean final rank and variance 
for 3, 4, ..., 9 boats and 2 races. 

Figure 5: The sketch of an event for $n_r = 2$ and $n_b = 6$. A boat is represented by a point 
whose coordinates are its ranks in the two races. Here, we fix the score $n_t = 6$ of the 
virtual boat (dashed diagonal). There are 2 sites occupied in D. Thus, the rank of the 
virtual boat is $m = 3$ for this event. 



4.1 Sketch and basic properties 

For two races, the situation can be sketched by using a $n_b \times n_b$ square lattice as in Fig. 5 
for $n_b = 6$. 

The two coordinates correspond to the ranks of a boat in each one of the two races. 
So, each boat will be represented by an occupied site. It follows that each line and each 
column will be occupied once and only once. This leads to $n_b!$ possible configurations. 

The score $n_t$ of the virtual boat is fixed and represented by the dashed diagonal. Let 
us call D the domain under the diagonal. The rank of the virtual boat is equal to $m$ when 
$(m - 1)$ sites are occupied in D. We have obviously $P_{n_t}(m) = \delta_{m,1}$ when $n_t \le 2$ and 
$P_{n_t}(m) = \delta_{m,n_b+1}$ when $n_t \ge 2n_b + 1$. Moreover, from symmetry considerations, 

$$P_{n_b+1-k}(m) = P_{n_b+2+k}(n_b + 2 - m), \qquad k = 0, 1, \dots, n_b - 1 \qquad (23)$$

So, in the following, we will restrict $n_t$ to the range $2 \le n_t \le n_b + 1$. In that case, it is 
easy to realize that only $(n_t - 2)$ columns (or lines) are available in D. This implies for 
$m$ the restriction $1 \le m \le n_t - 1$. 

We also observe that the distribution is symmetric for $n_t = n_b + 1$ or $n_b + 2$ 

$$P_{n_b+1}(m) = P_{n_b+1}(n_b + 1 - m) = P_{n_b+2}(m + 1), \qquad m = 1, 2, \dots, n_b \qquad (24)$$

We will come back to this point later. 
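The lattice picture translates directly into an enumeration over permutations. The sketch below is an illustration; it assumes, following the rank definition of Section 2, that the domain D consists of the sites whose two coordinates sum to less than $n_t$. The function name is ours.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def two_race_rank_distribution(n_b, n_t):
    """Exact P_{n_t}(m), m = 1..n_b+1, for two races, by listing all n_b! lattice
    configurations (the boat ranked i in race 1 is ranked sigma[i-1] in race 2)."""
    counts = [0] * (n_b + 2)
    for sigma in permutations(range(1, n_b + 1)):
        occupied_in_D = sum(1 for i, j in enumerate(sigma, start=1) if i + j < n_t)
        counts[occupied_in_D + 1] += 1
    return {m: Fraction(counts[m], factorial(n_b)) for m in range(1, n_b + 2)}


n_b = 5
for k in range(n_b):
    low = two_race_rank_distribution(n_b, n_b + 1 - k)
    high = two_race_rank_distribution(n_b, n_b + 2 + k)
    # symmetry (23): P_{n_b+1-k}(m) = P_{n_b+2+k}(n_b+2-m)
    print(k, all(low[m] == high[n_b + 2 - m] for m in range(1, n_b + 2)))
```

The same function can be used to check the symmetric cases (24) by comparing the distributions for $n_t = n_b + 1$ and $n_t = n_b + 2$.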

4.2 Direct computations of $P_{n_t}(m)$ for some $m$ 

For $m = n_t - 1$, we observe (see Fig. 6) that there is only one possibility to occupy the 
$(n_t - 2)$ sites in D. 

The $(n_b - n_t + 2)$ remaining occupied sites are distributed randomly on the sites of the 
$(n_b - n_t + 2)$ remaining lines and columns that are still available. Denoting $u = n_b - n_t + 2$ 
one obtains 

$$P_{n_t}(m = n_t - 1) = \frac{u!}{n_b!} \qquad (25)$$

Figure 6: A configuration contributing to $P_{n_t}(m = n_t - 1)$. We have only one possibility 
for the $(n_t - 2)$ occupied sites under the dashed diagonal. 

Figure 7: A configuration contributing to $P_{n_t}(1)$. No occupied site belongs to D. For each 
line a), b), ..., d), we have $n_b - n_t + 2$ possibilities for the occupied sites. The remaining 
occupied sites will generate the factor $P_{n_t}(n_t - 1)$. For further explanations, see the text. 

Now, for $m = 1$, there are no occupied sites in D. Let us fill (Fig. 7) the lines, starting 
from the bottom. On line (a), we have $n_b - n_t + 2$ $(= u)$ available sites; on line (b), we 
still have $u$ available sites (because of the site occupied in line (a)); and so on, up to line 
(d). Moreover, from the $u$ upper lines, we still get a factor $u!$. 
Finally 

$$P_{n_t}(1) = P_{n_t}(n_t - 1) \cdot \Phi_1(u) \quad \text{with} \quad \Phi_1(u) = u^{n_t - 2} \qquad (26)$$
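The two counting arguments leading to (25) and (26) can be tested by enumeration. The sketch below is an illustration, assuming that D is the set of lattice sites whose coordinates sum to less than $n_t$; the function names are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def two_race_rank_distribution(n_b, n_t):
    """Exact P_{n_t}(m) for two races, by enumeration of the n_b! configurations."""
    counts = [0] * (n_b + 2)
    for sigma in permutations(range(1, n_b + 1)):
        counts[1 + sum(1 for i, j in enumerate(sigma, start=1) if i + j < n_t)] += 1
    return {m: Fraction(counts[m], factorial(n_b)) for m in range(1, n_b + 2)}


def check_extreme_ranks(n_b):
    """Compare the enumeration with (25) and (26) for every 2 <= n_t <= n_b + 1."""
    for n_t in range(2, n_b + 2):
        u = n_b - n_t + 2
        P = two_race_rank_distribution(n_b, n_t)
        if P[n_t - 1] != Fraction(factorial(u), factorial(n_b)):   # (25)
            return False
        if P[1] != P[n_t - 1] * u ** (n_t - 2):                    # (26)
            return False
    return True


print(check_extreme_ranks(6))
```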

It is easy to see, from the above considerations, that, for $1 \le m \le n_t - 1$ 

$$P_{n_t}(m) = P_{n_t}(n_t - 1)\, \Phi_m(u) \qquad (27)$$

where $\Phi_m(u)$ is a polynomial in $u$ with integer values$^2$. 

Figure 8: A configuration contributing to $P_{n_t}(2)$. The occupied site, B, in D, has 
coordinates $i$ and $k$. For further explanations, see the text. 

For $m = 2$, there is one occupied site, B, in D. 
With the coordinates $(i, k)$ defined in Fig. 8, D is the domain $(0 \le i \le k - 1;\ 1 \le k \le n_t - 2)$ 
so that 

$$P_{n_t}(2) = P_{n_t}(n_t - 1) \sum_{D} u^{n_t - 2 - k} (u + 1)^{k - i - 1} u^{i} = P_{n_t}(n_t - 1)\, \Phi_2(u) \qquad (28)$$

with 

$$\Phi_2(u) = (u + 1)^{n_t - 2} (u + 1) - u^{n_t - 2} (u + n_t - 1) \qquad (29)$$



The computation for $m = 3$ is more involved because the relative position of the two 
occupied sites in D plays an important role in the expression of the terms to be summed. 
One gets 

$$\Phi_3(u) = \frac{1}{2} \Big[ (u + 2)^{n_t - 2} (u + 1)(u + 2) - 2 (u + 1)^{n_t - 2} (u + 1)(u + n_t - 1) + u^{n_t - 2} (u + n_t - 1)(u + n_t - 2) \Big] \qquad (30)$$

It is worth noting that, despite the apparent complexity of $\Phi_3(u)$, the degree of $\Phi_m(u)$ 
decreases when $m$ increases. We will clarify this point later. 

The case $m = 4$ seems out of reach by direct computation and will not be pursued 
along these lines. 



4.3 Recursion relation and solution of the case $n_r = 2$ 

Looking at (26), (29), (30), we observe that, for $m \le 2$, $\Phi_m(u)$ satisfies the recursion relation 

$$\Phi_{m+1}(u) = \frac{1}{m} \Big[ (u + 1)\, \Phi_m(u + 1) - (u + n_t - m)\, \Phi_m(u) \Big] \qquad (31)$$

We will now show that (31) holds in general. 

$^2$ This is not true for $n_r \ge 3$. 

Figure 9: The 3 ways for producing a configuration contributing to $N'(m)$ (see the text 
for definition): i) start from a configuration contributing to $N(m + 1)$, erase $A_i$ and add 
$A'_i$ and $A''_i$; ii) start from a configuration contributing to $N(m)$, erase $B_j$ and add $B'_j$ and 
$B''_j$; iii) start from a configuration contributing to $N(m)$ and add C. 

Let us write $P_{n_t}(m) = \frac{u!}{n_b!} \Phi_m(u) = \frac{N(m)}{n_b!}$ where $N(m)$ is the number of configurations 
of the $n_b \times n_b$ square with $(m - 1)$ occupied sites in D. Changing $n_b$ into $n_b + 1$ (which 
amounts to changing $u$ into $u + 1$ while keeping $n_t$ unchanged), we call $P'_{n_t}(m)$ the new 
probability distribution $P'_{n_t}(m) = \frac{(u+1)!}{(n_b+1)!} \Phi_m(u + 1) = \frac{N'(m)}{(n_b+1)!}$ where $N'(m)$ is defined like 
$N(m)$ but for the $(n_b + 1) \times (n_b + 1)$ square lattice (Fig. 9). 

$N'(m)$ receives three kinds of contributions: 

i) Let us consider a configuration contributing to $N(m + 1)$ ($m$ occupied sites $A_i$ in D; 
see Fig. 9). The replacement of $A_i$ by $A'_i$ and $A''_i$ produces a configuration contributing 
to $N'(m)$ (only $(m - 1)$ occupied sites in D; all the columns and lines of the biggest square 
are occupied once). Since we can choose any of the $A_i$'s before applying this procedure, 
we get a contribution $m N(m + 1)$ to $N'(m)$. 

ii) Let us next consider a configuration contributing to $N(m)$ ($n_b + 1 - m$ occupied 
sites $B_j$ in E; see Fig. 9). By the same reasoning as in (i), we get $(n_b + 1 - m) N(m)$ 
configurations for $N'(m)$. 

iii) To each configuration contributing to $N(m)$, we can add an occupied site in C (see 
Fig. 9). This produces the contribution $N(m)$ to $N'(m)$. 

Summing the above contributions leads to 

$$N'(m) = m N(m + 1) + (n_b + 2 - m) N(m) \qquad (32)$$

Reverting back to the $\Phi_m$'s, it is straightforward to get (31). Equations (26) and (31) prove 
that $\Phi_m(u)$ has degree $n_t - m - 1$. 
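The recursion (32) relates configuration counts on lattices of consecutive sizes, so it can be tested directly. The following sketch is an illustration, assuming that D is the set of sites whose coordinates sum to less than $n_t$ and that $N(m)$ vanishes outside its allowed range; the function names are ours.

```python
from itertools import permutations


def configuration_counts(n_b, n_t):
    """N(m): number of configurations of the n_b x n_b lattice with m - 1 occupied
    sites in D. The list is indexed by m and padded with zeros."""
    counts = [0] * (n_b + 4)
    for sigma in permutations(range(1, n_b + 1)):
        counts[1 + sum(1 for i, j in enumerate(sigma, start=1) if i + j < n_t)] += 1
    return counts


def recursion_holds(n_b, n_t):
    """Check N'(m) = m N(m+1) + (n_b + 2 - m) N(m), with N' counted on the
    (n_b + 1) x (n_b + 1) lattice at the same n_t."""
    N = configuration_counts(n_b, n_t)
    N_prime = configuration_counts(n_b + 1, n_t)
    return all(N_prime[m] == m * N[m + 1] + (n_b + 2 - m) * N[m]
               for m in range(1, n_b + 2))


n_b = 5
print([recursion_holds(n_b, n_t) for n_t in range(2, n_b + 2)])
```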



Finally, solving the recursion equation, we get the exact solution for $n_r = 2$ 

$$P_{n_t}(m) = (n_b + 1) \sum_{k=0}^{m-1} (-1)^k\, (n_b - n_t + m - k + 1)^{n_t - 2}\, \frac{(n_b - n_t + m - k + 1)!}{k!\, (n_b - k + 1)!\, (m - k - 1)!} \qquad (33)$$

with $2 \le m + 1 \le n_t \le n_b + 1$ understood. We have checked (33) by a complete 
enumeration of the permutations up to $n_b$ and $n_t = 10$. 
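A small-scale version of this check is sketched below. It is an illustration, not the authors' code: the closed form is transcribed from (33) as we read it, and the enumeration assumes that D is the set of sites whose coordinates sum to less than $n_t$. Function names are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def exact_probability(n_b, n_t, m):
    """Closed form (33), for 2 <= m + 1 <= n_t <= n_b + 1."""
    total = Fraction(0)
    for k in range(m):                      # k = 0, ..., m - 1
        a = n_b - n_t + m - k + 1
        total += Fraction((-1) ** k * a ** (n_t - 2) * factorial(a),
                          factorial(k) * factorial(n_b - k + 1) * factorial(m - k - 1))
    return (n_b + 1) * total


def enumerated_probability(n_b, n_t, m):
    """Fraction of the n_b! configurations with exactly m - 1 occupied sites in D."""
    hits = sum(1 for sigma in permutations(range(1, n_b + 1))
               if sum(1 for i, j in enumerate(sigma, start=1) if i + j < n_t) == m - 1)
    return Fraction(hits, factorial(n_b))


n_b = 5
print(all(exact_probability(n_b, n_t, m) == enumerated_probability(n_b, n_t, m)
          for n_t in range(2, n_b + 2) for m in range(1, n_t)))
```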

Let us discuss the case $n_t = n_b + 1$. Equation (33) narrows down to 

We recover the fact that $P_{n_t}(m)$ is symmetric. These results will be especially useful in 
the next section. 



4.4 Computations of the first three moments for $n_t \le n_b$ 

Starting from the equation (32), we get 

$$m P_{n_t}(m + 1) = (n_b + 1) P'_{n_t}(m) - (n_b + 2 - m) P_{n_t}(m) \qquad (39)$$

(recall that $P'_{n_t}(m)$ is the same as $P_{n_t}(m)$ but for $n_b$ changed into $n_b + 1$). Multiplying 
both sides of (39) by $m^k$ and summing over $m$, the recursion equation for the moments 
follows 

$$(n_b + 1 - k) \langle m^k \rangle + \sum_{p=0}^{k-1} (-1)^{k-p+1} \frac{(k + 1)!}{p!\, (k + 1 - p)!} \langle m^p \rangle = (n_b + 1) \langle m^k \rangle' \qquad (40)$$

($\langle \dots \rangle$ refers to $n_b$ and $\langle \dots \rangle'$ to $n_b + 1$). 

For $k = 1$, setting $Z_{n_b} = n_b \langle m \rangle$, we get $Z_{n_b} - Z_{n_b - 1} = 1$ and, finally, 
$Z_{n_b} = Z_{n_t - 1} + n_b - n_t + 1$. Computing with (35), we obtain the first moment 

$$\langle m \rangle = 1 + \frac{(n_t - 1)(n_t - 2)}{2 n_b} \qquad (41)$$

The other moments are obtained in a similar way. Equations (37), (38) and (40) lead to: 

$$\langle (m - \langle m \rangle)^2 \rangle = \frac{(n_t - 1)(n_t - 2)}{12\, n_b^2 (n_b - 1)} \big[ 3 n_t^2 - n_t (9 + 8 n_b) + 6 (n_b + 1)^2 \big], \qquad n_b \ge 2 \qquad (42)$$

$$\langle (m - \langle m \rangle)^3 \rangle = \frac{(n_t - 1)(n_t - 2)(n_t - n_b - 1)^2 (n_t - n_b - 2)^2}{2\, n_b^3 (n_b - 1)(n_b - 2)}, \qquad n_b \ge 3 \qquad (43)$$

As expected, $\langle (m - \langle m \rangle)^3 \rangle$ vanishes for $n_t = n_b + 1$ or $n_b + 2$ (the distribution 
is symmetric); $\langle (m - \langle m \rangle)^2 \rangle$ and $\langle (m - \langle m \rangle)^3 \rangle$ vanish for $n_t = 1$ or 2 
($P_{1,2}(m) = \delta_{m,1}$). 



5 The case $n_r \ge 3$ 

For the case of three or more races the problem is more complex. We can, however, 
establish some partial exact results. Fig. 10 demonstrates the situation for three races, 
displaying the mean final ranks and variances of the virtual boat for $n_b = 3, 4, 5, 6$ boats. 
The score of the virtual boat spans the interval $[3, 3n_b + 1]$. 

For $n_r = 3$ and $n_t \le n_b + 2$, we established and checked numerically the recursion 
relation 

$$N'(m) = (m + 1)\, m\, N(m + 2) + m (2 n_b - 2m + 3)\, N(m + 1) + (n_b - m + 2)^2 N(m) \qquad (44)$$

More generally, for $n_r \ge 3$, we obtained the expression 

$$\langle m \rangle = 1 + \frac{(n_t - 1)!}{n_b^{n_r - 1}\, (n_t - 1 - n_r)!\, n_r!} \qquad \text{for } n_r \le n_t \le n_b + n_r - 1 \qquad (45)$$
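Expression (45) can be tested by enumeration for a few boats. The sketch below is an illustration: it reads the prefactor in (45) as $n_b^{n_r-1}$ (an assumption of ours, consistent with (41) at $n_r = 2$), writes the ratio of factorials as a binomial coefficient, and fixes the first race to the natural order, which only relabels the boats. Function names are ours.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb


def enumerated_mean_rank(n_b, n_r, n_t):
    """Mean final rank of the virtual boat, averaging over all race results
    (race 1 is fixed to the natural order, the other n_r - 1 races are free)."""
    natural = tuple(range(1, n_b + 1))
    results = list(permutations(natural))
    rank_sum, events = 0, 0
    for races in product(results, repeat=n_r - 1):
        scores = [sum(ranks) for ranks in zip(natural, *races)]
        rank_sum += 1 + sum(s < n_t for s in scores)
        events += 1
    return Fraction(rank_sum, events)


def formula_mean_rank(n_b, n_r, n_t):
    """Expression (45), with (n_t-1)! / ((n_t-1-n_r)! n_r!) written as a binomial."""
    return 1 + Fraction(comb(n_t - 1, n_r), n_b ** (n_r - 1))


n_b, n_r = 4, 3
for n_t in range(n_r, n_b + n_r):
    print(n_t, enumerated_mean_rank(n_b, n_r, n_t), formula_mean_rank(n_b, n_r, n_t))
```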




Figure 10: By complete enumeration of all permutations: the mean final rank and variance 
for 3, 4, 5 and 6 boats and 3 races. 
6 Two races with the worst individual rank dropped 

We conclude our analysis with a variant of the original problem, also used in competitions, 
for the specific case of two races. 

Specifically, suppose that, for each boat, we drop the greatest rank (worst result) 
obtained in the two races. For instance, if the boat $i$ had ranks $n_{i,1} = 2$ and $n_{i,2} = 5$, we 
only retain the score $n_i = 2$. The virtual boat has a fixed score $n_t$ in the range $[1, n_b + 1]$ 
and, as before, its rank is $m$ when $(m - 1)$ boats have scores $n_i$ smaller than $n_t$. 

It is obvious that $m \ge n_t$. Indeed, without loss of generality, we can consider that the 
ranks $n_{i,1}$ obtained in the first race are arranged in natural order: $\{1, 2, \dots, n_b - 1, n_b\}$, i.e. 
$n_{i,1} = i$. (We will keep this order all along this section.) Now, from $n_i \le n_{i,1}$, it is easy to 
realize that at least $(n_t - 1)$ boats will have scores $n_i$ smaller than $n_t$, thus $m \ge n_t$. 

Defining the ordered sets $A = \{1, 2, \dots, n_t - 2, n_t - 1\}$ and $B = \{n_t, n_t + 1, \dots, n_b - 1, n_b\}$, 
we see that, taking, for the ordered$^3$ set of ranks in the second race, any permutation 
of A (for instance $\{n_t - 2, 2, \dots, 1, n_t - 1\}$) followed by any permutation of B (for instance 
$\{n_b - 1, n_t, n_t + 1, \dots, n_b\}$), we construct all the configurations leading to $m = n_t$. The 
number of such configurations is $(n_t - 1)! \times (n_b - n_t + 1)!$. Dividing by the total number 
of configurations $n_b!$, we get: 

$$P_{n_t}(n_t) = \frac{1}{\binom{n_b}{n_t - 1}} \qquad (46)$$

For $m > n_t$, we start from the naturally ordered sets A and B and exchange $(m - n_t)$ 
elements of A with $(m - n_t)$ elements of B (of course, $m - n_t \le n_t - 1$ and 
$m - n_t \le n_b - n_t + 1$). So, we get the sets A' and B'. Taking, for the ordered set of ranks in 
the second race, any permutation of A' followed by any permutation of B', we get all 
the configurations leading to the rank $m$ for the virtual boat. We eventually obtain a 
hypergeometric law for the random variable $(m - n_t)$ 

$$P_{n_t}(m) = \frac{\binom{n_t - 1}{m - n_t} \binom{n_b - n_t + 1}{m - n_t}}{\binom{n_b}{n_t - 1}} \qquad (47)$$

with $n_t \le m \le \min\{2 n_t - 1, n_b + 1\}$. 

Of course, this probability density is quite different from the one obtained in Section 4. 
In particular, it is interesting to note that the distribution (47) is unchanged when we 
replace, simultaneously, $n_t$ by $n'_t = n_b + 2 - n_t$ and $m$ by $m' = m + n_b + 2 - 2 n_t$ 

$$P_{n'_t}(m') = P_{n_t}(m) \qquad (48)$$

(Note that $n'_t - 1 = n_b + 1 - n_t$, $n_b - n'_t + 1 = n_t - 1$ and $m' - n'_t = m - n_t$. So, from 
(47), $P_{n_t}(m)$ is unchanged.) 
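The hypergeometric law (47) and the symmetry (48) can be confirmed by enumeration. The sketch below is an illustration under the definitions of this section (first race in natural order, score equal to the smaller of the two ranks); the function names are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def dropped_worst_distribution(n_b, n_t):
    """Exact P_{n_t}(m) when each boat keeps only its better rank, by enumeration
    (the boat ranked i in race 1 is ranked sigma[i-1] in race 2)."""
    counts = [0] * (n_b + 2)
    for sigma in permutations(range(1, n_b + 1)):
        counts[1 + sum(1 for i, j in enumerate(sigma, start=1) if min(i, j) < n_t)] += 1
    return {m: Fraction(counts[m], factorial(n_b)) for m in range(1, n_b + 2)}


def hypergeometric_law(n_b, n_t, m):
    """Law (47); zero outside n_t <= m <= min(2 n_t - 1, n_b + 1)."""
    j = m - n_t
    if j < 0 or j > min(n_t - 1, n_b - n_t + 1):
        return Fraction(0)
    return Fraction(comb(n_t - 1, j) * comb(n_b - n_t + 1, j), comb(n_b, n_t - 1))


n_b = 6
print(all(dropped_worst_distribution(n_b, n_t)[m] == hypergeometric_law(n_b, n_t, m)
          for n_t in range(1, n_b + 2) for m in range(1, n_b + 2)))
```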



$^3$ Here, "ordered" does not mean "in natural order" but simply that we take into account the order 
when we enumerate the elements of the set (i.e., $\{a, b, \dots\} \neq \{b, a, \dots\}$). 

When $n_b$ is even, the distribution is symmetric for $n_t = \frac{n_b}{2} + 1$. Indeed 

$$P_{\frac{n_b}{2}+1}(m) = \frac{\binom{n_b/2}{m - \frac{n_b}{2} - 1}^2}{\binom{n_b}{n_b/2}} = P_{\frac{n_b}{2}+1}\Big(\frac{3 n_b}{2} + 2 - m\Big) \qquad (49)$$

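The moments of (47) are 

$$\langle m \rangle = n_t + \frac{(n_t - 1)(n_b - n_t + 1)}{n_b} \qquad (50)$$

$$\langle (m - \langle m \rangle)^2 \rangle = \frac{(n_t - 1)^2 (n_b - n_t + 1)^2}{n_b^2 (n_b - 1)}, \qquad n_b \ge 2 \qquad (51)$$

$$\langle (m - \langle m \rangle)^3 \rangle = - \frac{(n_t - 1)^2 (n_b - n_t + 1)^2 (n_b - 2 n_t + 2)^2}{n_b^3 (n_b - 1)(n_b - 2)}, \qquad n_b \ge 3 \qquad (52)$$

consistent with (48). Moreover, as expected, $\langle (m - \langle m \rangle)^2 \rangle$ and $\langle (m - \langle m \rangle)^3 \rangle$ 
vanish for $n_t = 1$ and $n_t = n_b + 1$. Finally, $\langle (m - \langle m \rangle)^3 \rangle$ vanishes for $n_t = \frac{n_b}{2} + 1$ 
when $n_b$ is even (the distribution is symmetric, see (49)). 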
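The moments (50), (51) and (52) follow from the hypergeometric law and can be recomputed from it with exact arithmetic. The sketch below is an illustration; the law and the moment formulas are transcribed as we read them from (47) and (50) to (52), and the function names are ours.

```python
from fractions import Fraction
from math import comb


def law(n_b, n_t):
    """Distribution (47) on its support n_t <= m <= min(2 n_t - 1, n_b + 1)."""
    return {n_t + j: Fraction(comb(n_t - 1, j) * comb(n_b - n_t + 1, j),
                              comb(n_b, n_t - 1))
            for j in range(min(n_t - 1, n_b - n_t + 1) + 1)}


def moments_from_law(n_b, n_t):
    """Mean, variance and third central moment computed from the distribution."""
    P = law(n_b, n_t)
    mean = sum(m * p for m, p in P.items())
    return (mean,
            sum((m - mean) ** 2 * p for m, p in P.items()),
            sum((m - mean) ** 3 * p for m, p in P.items()))


def moments_from_formulas(n_b, n_t):
    """Closed forms (50), (51), (52); requires n_b >= 3."""
    a, b = n_t - 1, n_b - n_t + 1
    return (n_t + Fraction(a * b, n_b),
            Fraction(a ** 2 * b ** 2, n_b ** 2 * (n_b - 1)),
            -Fraction(a ** 2 * b ** 2 * (n_b - 2 * n_t + 2) ** 2,
                      n_b ** 3 * (n_b - 1) * (n_b - 2)))


n_b = 6
print(all(moments_from_law(n_b, n_t) == moments_from_formulas(n_b, n_t)
          for n_t in range(1, n_b + 2)))
```

For even $n_b$ the third moment returned at $n_t = n_b/2 + 1$ is zero, as stated above.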



7 Conclusions 

We demonstrated that the problem of determining the final rank distribution for a boat 
in a set of races given its total score can be explicitly solved in two distinct situations: 
for a large number of races, and for a few (2 or 3) races. We also demonstrated that 
the "Statistical Curse of the Second Half Rank" effect can be attributed to statistical 
averaging in the case of many races. 

Although we obtained our results in the context and language of boat racing, they are 
clearly applicable in several similar situations, such as, e.g., student ranks based on their 
results in many exams or quizzes, ranks of candidates for positions or awards when they 
are reviewed and ranked by many independent evaluators, and voting results when voters 
submit a ranking of the choices. 

There are many open issues and unsolved problems for further investigation. The 
exact result for an arbitrary number of races (greater than 2) is not known. Further, the 
obtained results are based on the simplifying assumption that all boats are equally worthy 
(all ranks in each race are equally probable). One could examine the situation in which 
boats have a priori different inherent worths, handicapping the probabilities for the ranks, 
and see to what extent the "statistical curse" effect also emerges. Finally, the relevance 
and relation of our results with well-known difficulties in rank situations, such as Arrow's 
Impossibility theorem [2111], would be an interesting topic for further investigation. 